Rank in Wordlist | Frequency | Word |
---|---|---|
4394 | 111 | 2,5 |
4478 | 109 | 1,5 |
6871 | 72 | 3,5 |
8094 | 61 | 0,5 |
9832 | 50 | 4,5 |
11091 | 44 | 7,5 |
12142 | 40 | 1,2 |
13685 | 35 | 6,5 |
14380 | 33 | 0,1 |
14802 | 32 | 1,6 |
Rank in Wordlist | Frequency | Word |
---|---|---|
171192 | 1 | .) |
Rank in Wordlist | Frequency | Word |
---|---|---|
2857 | 167 | 10% |
3257 | 146 | 50% |
3352 | 142 | 90% |
3804 | 128 | 20% |
4727 | 103 | 40% |
5284 | 93 | 60% |
5487 | 90 | 5% |
5550 | 89 | 30% |
5630 | 88 | 80% |
5766 | 86 | 25% |
Rank in Wordlist | Frequency | Word |
---|---|---|
28131 | 15 | R&B |
38959 | 10 | S&P |
41491 | 9 | AT&T |
45462 | 8 | B&H |
76329 | 4 | H&K |
90862 | 3 | A&R |
93645 | 3 | G&L |
118567 | 2 | AT&T-u |
125731 | 2 | H&E |
135881 | 2 | R&B-a |
Rank in Wordlist | Frequency | Word |
---|---|---|
140166 | 2 | USD$4,99 |
177952 | 1 | 29$/g |
280063 | 1 | US$1,3 |
280064 | 1 | US$1.1 |
280065 | 1 | US$10,8 |
280066 | 1 | US$100 |
280067 | 1 | US$100.670 |
280068 | 1 | US$14,89 |
280069 | 1 | US$38 |
280070 | 1 | US$5 |
Rank in Wordlist | Frequency | Word |
---|---|---|
268 | 1176 | ." |
Rank in Wordlist | Frequency | Word |
---|---|---|
5775 | 86 | O'Sullivan |
6882 | 72 | Kur'ana |
9848 | 50 | Kur'an |
11358 | 43 | Kur'anu |
19783 | 23 | O'Sullivana |
20398 | 22 | .' |
21303 | 21 | O'Toole |
26648 | 16 | O'Brien |
31313 | 13 | Can't |
31410 | 13 | I'm |
Rank in Wordlist | Frequency | Word |
---|---|---|
63616 | 5 | 4+1+3 |
66013 | 5 | P+1 |
97567 | 3 | P+2 |
97568 | 3 | P+3 |
117177 | 2 | 2+4 |
117480 | 2 | 3+2+3 |
140174 | 2 | UTC+1 |
142116 | 2 | a+b |
154374 | 2 | n+1 |
171450 | 1 | 08279+5255 |
Rank in Wordlist | Frequency | Word |
---|---|---|
2669 | 178 | km/h |
3717 | 131 | i/ili |
7892 | 63 | m/s |
16592 | 29 | st/km² |
18709 | 25 | n/v |
19007 | 24 | 2/3 |
19671 | 23 | 1/3 |
20730 | 22 | km/s |
22047 | 20 | 2015/16. |
23524 | 19 | mg/kg |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others will not. If words containing forbidden characters are not confined to very low frequencies, this may indicate a problem in preprocessing.
Words containing %:
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
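The same lookup can be repeated for each character in the list. The following is a minimal sketch, not the report's actual tooling: it assumes a `words` table with the columns `w_id`, `word` and `freq` used in the query, and runs against an in-memory SQLite database as a stand-in for the real one. The helper names (`like_pattern`, `words_containing`, `demo_connection`), the sample rows and the `ORDER BY w_id` clause are assumptions added for illustration. Because `%` and `_` are wildcards in `LIKE`, both are escaped before being placed in the pattern; SQLite needs an explicit `ESCAPE` clause for this.

```python
import sqlite3

# Characters checked in this subsection.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(ch: str) -> str:
    """Build a LIKE pattern that matches any word containing ch literally."""
    # % and _ are LIKE wildcards, so they (and the escape character itself)
    # are prefixed with a backslash.
    escaped = ch.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def words_containing(conn, ch, limit=10):
    """Return (rank, freq, word) rows for words that contain ch."""
    query = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"  # ORDER BY is an assumption, for stable output
    )
    return conn.execute(query, (like_pattern(ch), limit)).fetchall()


def demo_connection():
    """Create a small in-memory table with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)"
    )
    rows = [
        (50, "%", 9999),      # made-up row; skipped because w_id <= 100
        (2769, "km/h", 178),
        (2957, "10%", 167),
        (3357, "50%", 146),
        (5000, "a_b", 3),     # made-up row to exercise the _ escape
        (6000, "plain", 80),  # made-up row without special characters
        (28231, "R&B", 15),
    ]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn
```

Looping over `SPECIAL_CHARS` and calling `words_containing(conn, ch)` for each entry gives one result list per character, in the same three-column layout as the tables in this subsection.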
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots